Abstract

Propagation of contagion in networks depends on the graph topology. This paper is concerned 
with studying the time-asymptotic behavior of the extended contact processes on static, undirected, 
finite-size networks. This is a contact process with nonzero exogenous infection rate (also known as
the ε-SIS, ε-susceptible-infected-susceptible, model [1]). The only known analytical characterization of
the equilibrium distribution of this process is for complete networks. For large networks with arbitrary 
topology, it is infeasible to numerically solve for the equilibrium distribution since it requires solving the 
eigenvalue-eigenvector problem of a matrix that is exponential in N, the size of the network. We show 
that, for a certain range of the network process parameters, the equilibrium distribution of the extended 
contact process on arbitrary, finite-size networks is well approximated by the equilibrium distribution 
of the scaled SIS process, which we derived in closed-form in prior work. We confirm this result with 
numerical simulations comparing the equilibrium distribution of the extended contact process with that 
of a scaled SIS process. We use this approximation to decide, in polynomial-time, which agents and 
network substructures are more susceptible to infection by the extended contact process. 

This work is partially supported by AFOSR grant FA95501010291, and by NSF grants CCF1011903 and CCF1018509. 
I. Introduction

The contact process [2] and its extension, the ε-SIS (susceptible-infected-susceptible) model
[1], which we will refer to as the extended contact process in this paper, are models widely
considered for describing the propagation dynamics of failures or epidemics in complex networks; 
the network describes and constrains the interactions and interdependencies between multiple 
agents/components in the system [0, [[51, [0. We call models of such phenomena network 
processes. Network processes extend traditional dynamical processes since the network substrate 
itself is a determinant of the observed dynamics. Except in special cases, such as when the 
network is a complete graph or the network is comprised of isolated nodes, it is a challenge to 
analyze how network topology influences the process dynamics. In network science, developing 
analyzable models that quantify the impact of topology on the behavior of network processes 
remains an open question. 

We are interested in understanding the role that topology plays in the time-asymptotic behaviors
of network processes. For continuous-time Markov processes such as the contact and the
extended contact processes, the time-asymptotic behavior is characterized by their equilibrium
distribution (i.e., the limiting distribution of the process). The equilibrium distribution of the
contact process on networks with $N$ agents ($N < \infty$) is trivial, because it assumes healing and
infections are only due to contagion from infected neighbors dH. The contact process has an 
absorbing state; its equilibrium distribution is zero everywhere, except one at the absorbing state. 
References [j7|, [HI looked at the effect of topology on the time it takes for the process to reach 
steady state. 

Due to the inclusion of a nonzero exogenous infection rate, the extended contact process does
not have a trivial equilibrium distribution. In general, computing the equilibrium distribution of
this network process requires solving for the left eigenvector corresponding to the zero eigenvalue
of the $2^N \times 2^N$ transition rate matrix, $Q_e$; this is an infeasible computation for networks
with more than a few agents. For large-scale networks, researchers usually approximate large- 
scale networks by infinite-size networks using the mean-field approximation [HI. We take a 
different approach and show that, for a subclass of extended contact processes, the equilibrium
distribution can be approximated by that of the scaled SIS process, for which we found the
closed-form equilibrium distribution on arbitrary, undirected, finite-size network
topologies [10]. Unlike the extended contact process, which assumes that the infection rate of
a healthy agent is linearly dependent on its number of infected neighbors, the scaled SIS process
assumes an exponential dependence.

We use this analytical characterization of the equilibrium distribution of the scaled SIS process 
as an approximation to the equilibrium distribution of the extended contact process. The paper 
shows that this approximation is appropriate for a range of endogenous infection rates of the 
extended contact process; it shows this range depends on the maximum degree of the underlying 
network. With numerical simulations, we confirm that, within this parameter range, the deviation 
between the true equilibrium distribution of the extended contact process and the approximation 
is very small (on the order of 10“®). Further, we observe from experiments that for certain 
network topologies, the approximation remains good even as the infection rate deviates from the 
established range. 

We use the equilibrium distribution to address important questions regarding the extended
contact process, such as deciding which agents and network structures are more susceptible to
infection, by solving for the most-probable configuration, i.e., the configuration with maximum
equilibrium probability. When the approximation holds, the most-probable configuration of the
extended contact process is the same as that of the scaled SIS process, which we proved can be
found in polynomial time in [10].

In Section II, we review the contact process and the extended contact process. We review the
scaled SIS process and compare and contrast it with the extended contact process in Section III.
In Section IV, we show a bound on the endogenous infection rate such that the equilibrium
distribution of the extended contact process is well approximated by that of the scaled SIS
process. We compare the true equilibrium distribution of the extended contact process with its
approximation for six different 16-node networks using the total variation distance in Section V.
We discuss the most-probable configuration of the extended contact process and the approximate
distribution in Section VI. Section VII concludes the paper.
II. Contact Process

The contact process models the spread of infection in a network [2]. It is a binary state,
irreducible, continuous-time Markov process on a static, simple, connected, undirected network 
G. See [fTTI . [fT^ for review of continuous-time Markov processes, [lT3l . ifT^ for review of graph 
theory. Each node in the network is an agent in the population. Each node can be in one of two 
states, {0,1}, representing, for example, healthy or infected state. For a system with $N$ nodes,
the microscopic network configuration is

$$\mathbf{x} = [x_1, x_2, \ldots, x_N]^T, \quad \text{where } x_i \in \{0, 1\}.$$

As a result, there are $2^N$ possible configurations.

The contact process models SIS (susceptible-infected-susceptible) epidemics on networks. 
There are two types of state transitions representing 1) healing of infected agents and 2) infection 
of susceptible agents. 

1) Consider the configuration

$$\mathbf{x} = [x_1, x_2, \ldots, x_j = 1, x_k, \ldots, x_N]^T.$$

Let $T_j^- \mathbf{x}$ be the configuration where the $j$th agent heals:

$$T_j^- \mathbf{x} = [x_1, x_2, \ldots, x_j = 0, x_k, \ldots, x_N]^T.$$

The contact process transitions from $\mathbf{x}$ to $T_j^- \mathbf{x}$ in an exponentially distributed random
amount of time with transition rate

$$q(\mathbf{x}, T_j^- \mathbf{x}) = \mu. \tag{1}$$

Parameter $\mu$ is the healing rate. Without loss of generality, typically $\mu = 1$.

2) Consider the configuration

$$\mathbf{x} = [x_1, x_2, \ldots, x_j, x_k = 0, \ldots, x_N]^T.$$

Let $T_k^+ \mathbf{x}$ be the configuration where the $k$th agent becomes infected:

$$T_k^+ \mathbf{x} = [x_1, x_2, \ldots, x_j, x_k = 1, \ldots, x_N]^T.$$

The contact process transitions from $\mathbf{x}$ to $T_k^+ \mathbf{x}$ in an exponentially distributed random
amount of time with transition rate

$$q(\mathbf{x}, T_k^+ \mathbf{x}) = \beta_e \sum_{i=1}^{N} x_i A_{ik}, \tag{2}$$

where $A = [A_{ik}]$ is the adjacency matrix of the underlying network. The parameter $\beta_e > 0$
is the endogenous infection rate. The infection rate of the $k$th agent is assumed to be
linearly dependent on its number of infected neighbors, $m_k = \sum_{i=1}^{N} x_i A_{ik}$.

In the contact process, when all the agents in the network are healthy, the process dies out. 
The configuration where all the agents are healthy ($\mathbf{x}^0 = [0, 0, \ldots, 0]^T$) is an absorbing state of
the Markov process. For networks with $N$ agents, the contact process will eventually reach the
configuration $\mathbf{x}^0$ and remain there indefinitely. Thus, the equilibrium distribution is trivial for
contact processes on finite-size networks [^. 

A. Extended Contact Process 

In the contact process, a healthy agent can only become infected through contagion from an 
infected neighbor. It may be the case that a healthy agent (or working component) may also 
become infected (or fail) due to an exogenous (i.e., outside of the network) source —the agent 
is infected spontaneously ifTSll . ifTI . ifT^ . For SIS epidemics, this is captured by a non-zero 
exogenous infection rate, $\lambda$. The transition rate of the extended contact process from $\mathbf{x}$ to
$T_k^+ \mathbf{x}$ is

$$q(\mathbf{x}, T_k^+ \mathbf{x}) = \lambda + \beta_e \sum_{i=1}^{N} x_i A_{ik}, \tag{3}$$

where $A$ is the adjacency matrix of the underlying network. The healing rate remains the same
as in (1). We call this modified model the extended contact process, whereas [1] referred to it as
the ε-SIS model. When agent $k$ has 0 infected neighbors, the rate at which agent $k$ becomes
infected is the exogenous infection rate. For a system where spontaneous infection is rare, the
exogenous infection rate can be made arbitrarily small, but for the extended contact process, it
has to remain greater than zero.

The configuration where all the agents are healthy ($\mathbf{x}^0 = [0, 0, \ldots, 0]^T$) is no longer an
absorbing state of the Markov process since susceptible agents can spontaneously become infected. As a
result, the equilibrium distribution of the Markov process is no longer trivial. There are currently
no known tractable analytical results regarding this equilibrium distribution for the extended
contact process on arbitrary network topologies; reference [?] provided the exact equilibrium
distribution only for the complete graph.

The equilibrium distribution can be calculated numerically. However, this approach is infeasible
for large networks. In the case of an irreducible, continuous-time Markov process, the
equilibrium distribution, $\pi$, is the left eigenvector of the transition rate matrix, $Q_e$, corresponding
to the 0 eigenvalue. However, the transition rate matrix is a $2^N \times 2^N$ matrix, where $N$ is the size
of the network. Solving for the equilibrium distribution of the extended contact process over a
200-node network with arbitrary topology means finding the eigenvector of a $2^{200} \times 2^{200}$ matrix;
even taking into account sparsity, such computation is clearly infeasible.
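For a network small enough to enumerate, the numerical procedure just described can be sketched as follows. This is a minimal illustration rather than the code used for the experiments in this paper: the function names `extended_contact_generator` and `equilibrium`, the dense NumPy matrix, and the least-squares solve are all assumptions made for the example.

```python
import itertools
import numpy as np

def extended_contact_generator(A, lam, mu, beta_e):
    """Dense transition rate matrix Q_e of the extended contact process.

    A is an N x N symmetric 0/1 adjacency matrix (NumPy array); lam, mu and
    beta_e are the exogenous infection, healing and endogenous infection rates.
    """
    N = A.shape[0]
    configs = list(itertools.product([0, 1], repeat=N))
    index = {x: i for i, x in enumerate(configs)}
    Q = np.zeros((2 ** N, 2 ** N))
    for x in configs:
        i = index[x]
        for k in range(N):
            y = list(x)
            if x[k] == 1:
                y[k] = 0                      # agent k heals
                rate = mu
            else:
                y[k] = 1                      # agent k becomes infected
                m_k = sum(x[j] * A[j, k] for j in range(N))
                rate = lam + beta_e * m_k
            Q[i, index[tuple(y)]] = rate
        Q[i, i] = -Q[i].sum()                 # rows of a rate matrix sum to zero
    return configs, Q

def equilibrium(Q):
    """Solve pi Q = 0 subject to sum(pi) = 1 by least squares."""
    n = Q.shape[0]
    M = np.vstack([Q.T, np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return pi
```

The state space doubles with every added agent, so this enumeration is only practical for a handful of nodes; it is meant to make the structure of $Q_e$ concrete, not to scale.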

We will show in this paper that we can obtain an approximation to the equilibrium distribution 
over arbitrary network topologies for a subset of extended contact processes using the scaled 
SIS process. 


III. Scaled SIS Process

We introduced the scaled SIS process in llH, ifTTIl . Like the contact process, it is a binary state, 
irreducible, continuous-time Markov process on static, simple, connected, undirected networks. 
The scaled SIS process assumes that an agent can be in one of 2 states, i.e., {0,1}, representing
a healthy or infected state. The space of possible network configurations of the scaled SIS
process is the same as that of the contact and extended contact process. The scaled SIS process 
also accounts for two types of state transitions representing 1) healing of infected agents and 2) 
infection of susceptible agents. 

1) Consider the configuration

$$\mathbf{x} = [x_1, x_2, \ldots, x_j = 1, x_k, \ldots, x_N]^T.$$

Let $T_j^- \mathbf{x}$ be the configuration where the $j$th agent heals:

$$T_j^- \mathbf{x} = [x_1, x_2, \ldots, x_j = 0, x_k, \ldots, x_N]^T.$$

The scaled SIS process transitions from $\mathbf{x}$ to $T_j^- \mathbf{x}$ in an exponentially distributed random
amount of time with transition rate

$$q(\mathbf{x}, T_j^- \mathbf{x}) = \mu. \tag{4}$$

The healing rate of the scaled SIS process is the same as the healing rate of the contact
process.

2) Consider the configuration

$$\mathbf{x} = [x_1, x_2, \ldots, x_j, x_k = 0, \ldots, x_N]^T.$$

Let $T_k^+ \mathbf{x}$ be the configuration where the $k$th agent becomes infected:

$$T_k^+ \mathbf{x} = [x_1, x_2, \ldots, x_j, x_k = 1, \ldots, x_N]^T.$$

The scaled SIS process transitions from $\mathbf{x}$ to $T_k^+ \mathbf{x}$ in an exponentially distributed random
amount of time with transition rate

$$q(\mathbf{x}, T_k^+ \mathbf{x}) = \lambda \beta_s^{\sum_{i=1}^{N} x_i A_{ik}}, \tag{5}$$

where $A$ is the adjacency matrix of the underlying network. The parameter $\beta_s > 0$ is the
endogenous infection rate. Unlike the infection rate of the extended contact process (3),
the infection rate of the $k$th agent is assumed to be exponentially dependent on the number
of infected neighbors, $m_k = \sum_{i=1}^{N} x_i A_{ik}$. When the number of infected neighbors of agent
$k$ is 0, the infection rate, like for the extended contact process, reduces to the exogenous
infection rate $\lambda$.

We proved in [9], [?] that, for the scaled SIS process, the resulting continuous-time Markov
process is a reversible Markov process; a reversible Markov process is a stochastic process that
is statistically the same forward in time as it is in reverse [?]. The equilibrium distribution of a
reversible Markov process is unique. The equilibrium distribution of the scaled SIS process over
any undirected network topology described by the adjacency matrix, $A$, is found in [9], [?] to
be:

$$\pi(\mathbf{x}) = \frac{1}{Z} \left(\frac{\lambda}{\mu}\right)^{\mathbf{1}^T \mathbf{x}} \beta_s^{\frac{\mathbf{x}^T A \mathbf{x}}{2}}, \quad \mathbf{x} \in \mathcal{X}, \tag{6}$$

where $\mathbf{1}$ is the vector of all 1's, $Z$ is the partition function, and $\mathcal{X}$ is the space of $2^N$ possible
configurations [9], [?]. The equilibrium probability of a configuration $\mathbf{x}$ depends on the number
of infected agents, $\mathbf{1}^T \mathbf{x}$, and on the number of edges where both end nodes are infected, $\frac{\mathbf{x}^T A \mathbf{x}}{2}$.
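The closed form for the scaled SIS equilibrium distribution, a weight $(\lambda/\mu)^{\mathbf{1}^T\mathbf{x}} \beta_s^{\mathbf{x}^T A \mathbf{x}/2}$ normalized by the partition function, can be evaluated directly on a small network. The sketch below is illustrative only: the function name `scaled_sis_equilibrium` is hypothetical, and the partition function is computed by brute-force enumeration, which is an assumption of the example and only feasible for small $N$.

```python
import itertools
import numpy as np

def scaled_sis_equilibrium(A, lam, mu, beta_s):
    """Closed-form scaled SIS equilibrium over all 2^N configurations.

    Returns the configurations (one per row) and their probabilities.
    """
    N = A.shape[0]
    configs = np.array(list(itertools.product([0, 1], repeat=N)))
    infected = configs.sum(axis=1)                              # 1^T x
    edges = np.einsum("ci,ij,cj->c", configs, A, configs) / 2   # x^T A x / 2
    weights = (lam / mu) ** infected * beta_s ** edges
    return configs, weights / weights.sum()                     # divide by Z
```

Because the process is reversible, the result can be checked against detailed balance: for a configuration $\mathbf{x}$ in which agent $k$ is healthy with $m_k$ infected neighbors, $\pi(\mathbf{x}) \lambda \beta_s^{m_k} = \pi(T_k^+ \mathbf{x}) \mu$.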
A. Scaled SIS Process vs. Extended Contact Process


The infection rate of a susceptible agent in both the extended contact process and the scaled SIS 
process depends on its number of infected neighbors. The two models make different assumptions 
regarding the underlying mechanism of the contagion process: 


Fig. 1: Transition from Configuration $\mathbf{x}$ to Configuration $T_3^+ \mathbf{x}$

Extended Contact Process

The extended contact process is parameterized by the exogenous infection rate, $\lambda$, the
healing rate, $\mu$, and the endogenous infection rate $\beta_e$. Consider the scenario in Figure 1.
Let $T_3$ be the random amount of time it takes for agent $v_3$ to become infected. Each
infected neighbor of agent $v_3$ (i.e., $v_1$, $v_2$, $v_4$) and the exogenous (i.e., external) source
may infect $v_3$ in an exponentially distributed amount of time $T_3^{i} \sim \exp(\beta_e)$, $i = 1, 2, 4$,
and $T_3^{\mathrm{ex}} \sim \exp(\lambda)$, respectively. Therefore, $T_3 = \min\{T_3^{1}, T_3^{2}, T_3^{4}, T_3^{\mathrm{ex}}\}$. Assuming that
these sources act independently, then $T_3 \sim \exp(\lambda + 3\beta_e)$. As the number of infected
neighbors of $v_3$ increases, its infection rate also increases. The extended contact process
models a distributed contagion scenario where all the infection sources compete to be
the first to infect a healthy agent.

Scaled SIS Process

The scaled SIS process is parameterized by the exogenous infection rate, $\lambda$, healing
rate, $\mu$, and the endogenous infection rate $\beta_s$. Consider the scenario in Figure 1. Let
$T_3$ be the random amount of time it takes for agent $v_3$ to become infected. As agent
$v_3$ has three infected neighbors (i.e., $v_1$, $v_2$, $v_4$), the scaled SIS process assumes that
$T_3 = T / \beta_s^3 \sim \exp(\lambda \beta_s^3)$, where $T \sim \exp(\lambda)$ is the random amount of time it takes for a
healthy agent to become infected when it has no infected neighbors. When $\beta_s > 1$, as
the number of infected neighbors of $v_3$ increases, its infection rate also increases. Unlike
the extended contact process, the scaled SIS process assumes an aggregate contagion
scenario.
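The rate $\lambda + 3\beta_e$ in the extended contact process example follows from the standard fact that the minimum of independent exponential random variables is exponential with the summed rate. Written out for agent $v_3$ with three infected neighbors, and using $T_3^{1}, T_3^{2}, T_3^{4}$ and $T_3^{\mathrm{ex}}$ as labels chosen here for the neighbor and exogenous infection times:

```latex
\begin{align*}
P(T_3 > t) &= P(T_3^{1} > t)\, P(T_3^{2} > t)\, P(T_3^{4} > t)\, P(T_3^{\mathrm{ex}} > t) \\
           &= e^{-\beta_e t}\, e^{-\beta_e t}\, e^{-\beta_e t}\, e^{-\lambda t}
            = e^{-(\lambda + 3\beta_e) t},
\end{align*}
```

so $T_3 \sim \exp(\lambda + 3\beta_e)$. For comparison, dividing a single $\exp(\lambda)$ time by a constant $c > 0$ gives an $\exp(\lambda c)$ time, which is how a rate of the form $\lambda \beta_s^3$ arises from scaling one clock rather than from competing clocks.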


IV. Time-Asymptotic Behavior of the Extended Contact Process

For finite-size networks, unlike the contact process, the equilibrium distribution of the extended 
contact process is nontrivial. In this section, we show that, for a subclass of extended contact 
processes over arbitrary network topology, this equilibrium distribution is well approximated by 
the equilibrium distribution of the scaled SIS process; for these processes, the time-asymptotic
behaviors of both processes are similar.


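Lemma IV.1. [Proof in the appendix.] For any nonnegative integer $m$ from 0 to $d_{max}$, if

$$\Delta^2 \ll \frac{2}{d_{max}(d_{max} - 1)},$$

then

$$\frac{\lambda}{\mu}(1 + \Delta)^m \approx \frac{\lambda}{\mu} + \frac{\lambda}{\mu} \Delta m.$$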
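One way to see why a condition on $\Delta^2$ involving $d_{max}(d_{max} - 1)$ is natural here is the binomial expansion. The following is a heuristic reading under our own assumptions, not the proof given in the appendix; $\Delta$ denotes the factor relating $\beta_e$ to $\lambda/\mu$, and $m$ ranges over the possible numbers of infected neighbors.

```latex
\begin{align*}
(1 + \Delta)^m &= 1 + m\Delta + \binom{m}{2}\Delta^2 + \cdots, \\
\binom{m}{2}\Delta^2 &\le \frac{d_{max}(d_{max} - 1)}{2}\,\Delta^2 \ll 1
  \quad \text{whenever} \quad \Delta^2 \ll \frac{2}{d_{max}(d_{max} - 1)} .
\end{align*}
```

Under that condition the second-order term is negligible for every $m \le d_{max}$, leaving the linear form $1 + m\Delta$.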


Using Lemma IV.1, we can prove the following theorem.


Theorem IV.1. [Proof in the appendix.] Consider the extended contact process with exogenous infection
rate $\lambda$, healing rate $\mu$, and endogenous infection rate $\beta_e$, over a static, simple, connected,
undirected network of arbitrary topology, $G$, with maximum degree $d_{max}$. Let $\beta_e = \frac{\lambda}{\mu} \Delta$. If

$$\Delta^2 \ll \frac{2}{d_{max}(d_{max} - 1)},$$

then the equilibrium distribution of the extended contact process is well approximated by

$$\pi_{approx}(\mathbf{x}) = \frac{1}{Z} \left(\frac{\lambda}{\mu}\right)^{\mathbf{1}^T \mathbf{x}} (1 + \Delta)^{\frac{\mathbf{x}^T A \mathbf{x}}{2}}, \quad \mathbf{x} \in \mathcal{X}, \tag{7}$$

where $A$ is the adjacency matrix of the network $G$, and $Z$ is the partition function. The
approximate distribution, $\pi_{approx}(\mathbf{x})$, is the equilibrium distribution of a scaled SIS process
over the same network $G$ with exogenous infection rate $\lambda$, healing rate $\mu$, and endogenous
infection rate $\beta_s = 1 + \Delta$.
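The bound and the two infection rates it relates can be checked numerically. The sketch below assumes the bound is read as $\Delta_u = \sqrt{2 / (d_{max}(d_{max} - 1))}$, a reading that reproduces the upper-bound values 1, 0.32 and 0.098 quoted for the networks in the experiments, and it assumes $\mu = 1$. The function names are hypothetical.

```python
import math

def delta_upper_bound(d_max):
    """Assumed reading of the bound: sqrt(2 / (d_max (d_max - 1))), d_max >= 2."""
    return math.sqrt(2.0 / (d_max * (d_max - 1)))

def extended_rate(lam, delta, m):
    """Extended contact process infection rate lam + beta_e * m, with mu = 1."""
    beta_e = lam * delta              # beta_e = (lam / mu) * delta and mu = 1
    return lam + beta_e * m

def scaled_sis_rate(lam, delta, m):
    """Scaled SIS infection rate lam * beta_s^m with beta_s = 1 + delta."""
    return lam * (1.0 + delta) ** m
```

For example, with `delta = 0.0023` and `m = 15` infected neighbors, the two rates differ by well under 0.1 percent, whereas for `delta` near 1 they diverge quickly as `m` grows.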
Theorem IV.1 gives an upper bound on the factor, $\Delta$, between the endogenous infection rate, $\beta_e$,
and the ratio, $\frac{\lambda}{\mu}$, of the exogenous infection rate, $\lambda$, and the healing rate, $\mu$. This bound depends
on the maximum degree of the underlying network topology. When $\beta_e$ is much smaller than
$\frac{\lambda}{\mu} \sqrt{\frac{2}{d_{max}(d_{max} - 1)}}$, the equilibrium distribution, $\pi_e(\mathbf{x})$, of the extended contact process is well approximated
by that of an equivalent scaled SIS process. What does this imply about the extended contact
process?

Recall that, for the extended contact process, all infection sources are independent. Suppose
that susceptible agent $i$ has one infected neighbor. Let $T_i^{1} \sim \exp(\beta_e)$ be the random amount of
time it takes for susceptible agent $i$ to be infected by this infected neighbor, and $T_i^{\mathrm{ex}} \sim \exp(\lambda)$ be
the random amount of time it takes for susceptible agent $i$ to become infected by an exogenous
source. The probability that agent $i$ is infected by the exogenous source rather than by its
infected neighbor is

$$P(T_i^{\mathrm{ex}} < T_i^{1}) = \frac{\lambda}{\beta_e + \lambda} = \frac{1}{\frac{\Delta}{\mu} + 1}, \tag{8}$$

since $\beta_e = \frac{\lambda}{\mu} \Delta$. (See the appendix for a review regarding functions of exponentially distributed
random variables.) Suppose that susceptible agent $i$ has multiple (i.e., $m > 1$) infected neighbors.
The probability that agent $i$ is infected by the exogenous source rather than by its infected
neighbors is

$$P(T_i^{\mathrm{ex}} < \min\{T_i^{1}, \ldots, T_i^{m}\}) = \frac{\lambda}{m \beta_e + \lambda} = \frac{1}{\frac{m \Delta}{\mu} + 1}. \tag{9}$$

Without loss of generality, let $\mu = 1$. According to Theorem IV.1, the scaled SIS process is a
valid approximation for the extended contact process when $\Delta$ is small. In this case, according
to (8) and (9), the probability that the source of infection is exogenous rather than endogenous
is high; infection due to contagion from an infected neighbor is rare but not impossible.
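To get a feel for the magnitudes, the probability that an infection is exogenous, $1 / (m\Delta/\mu + 1)$ for an agent with $m$ infected neighbors, can be tabulated for a few values. This is a small illustrative helper; the name `prob_exogenous` and the default $\mu = 1$ are assumptions of the example.

```python
def prob_exogenous(delta, m, mu=1.0):
    """Probability that the exogenous source infects before any of m infected neighbors."""
    return 1.0 / (m * delta / mu + 1.0)

# A few (delta, m) pairs: small delta keeps the exogenous source dominant.
table = {(d, m): prob_exogenous(d, m) for d in (0.0023, 0.02, 1.0) for m in (1, 5, 15)}
```

With `delta = 0.0023` the probability stays above 0.96 even for 15 infected neighbors, while with `delta = 1.0` it is already 0.5 for a single infected neighbor.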


V. Experimental Simulations 

We showed when the extended contact process can be well approximated by the scaled
SIS process. We confirm Theorem IV.1 with numerical simulations. Further, we show that this
upper bound is conservative. Below it, the equilibrium distribution of the extended contact process,
$\pi_e(\mathbf{x})$, for arbitrary network topology is well approximated by the equilibrium distribution,
$\pi_{approx}(\mathbf{x})$, of a scaled SIS process. However, depending on the underlying network topology, the
approximation may still remain accurate (< 0.1 deviation) for extended contact processes with
parameters away from the bound.


A. Setup 


We will compare the true equilibrium distribution, $\pi_e(\mathbf{x})$, of the extended contact process, with
infection and healing rates $(\lambda, \mu, \beta_e = \frac{\lambda}{\mu} \Delta)$ over network $G$, with the approximate distribution,
$\pi_{approx}(\mathbf{x})$. The true distribution, $\pi_e(\mathbf{x})$, is found numerically by forming the transition rate
matrix, $Q_e$, according to (1) and (3) and solving for the left eigenvector of $Q_e$ corresponding to
eigenvalue 0. The approximate distribution, $\pi_{approx}(\mathbf{x})$, is obtained from the closed-form equation
according to Theorem IV.1:

$$\pi_{approx}(\mathbf{x}) = \frac{1}{Z} \left(\frac{\lambda}{\mu}\right)^{\mathbf{1}^T \mathbf{x}} (1 + \Delta)^{\frac{\mathbf{x}^T A \mathbf{x}}{2}}, \quad \mathbf{x} \in \mathcal{X}.$$

We solve for $\pi_e(\mathbf{x})$ and $\pi_{approx}(\mathbf{x})$ for different $\frac{\lambda}{\mu}$ values and different $\Delta$ values, both below
and above the upper bound.



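To quantify the difference between the exact and the approximate equilibrium distributions,
$\pi_e(\mathbf{x})$ and $\pi_{approx}(\mathbf{x})$, we use the total variation distance (TVD) [?]:

$$\mathrm{TVD}(\pi_e, \pi_{approx}) = \frac{1}{2} \sum_{\mathbf{x} \in \mathcal{X}} \left| \pi_e(\mathbf{x}) - \pi_{approx}(\mathbf{x}) \right|. \tag{10}$$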


When the two distributions are equal, TVD is 0. The maximum TVD between any two probability 
distributions over the same support is 1. 
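The total variation distance between two distributions on the same finite support, half the sum of absolute differences so that it lies between 0 and 1, takes one line to compute. A minimal sketch, with the function name `total_variation_distance` chosen for the example:

```python
import numpy as np

def total_variation_distance(p, q):
    """Half the L1 distance between two probability vectors on the same support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("distributions must be defined over the same support")
    return 0.5 * np.abs(p - q).sum()
```

Identical distributions give 0, and distributions with disjoint support give the maximum value of 1.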

As the true distribution of the extended contact process, $\pi_e(\mathbf{x})$, is obtained by solving the zero
eigenvalue-eigenvector problem of $Q_e$, which is a $2^N \times 2^N$ matrix, we are restricted to simulating
examples with small networks of size $N$. We consider six 16-node networks (see Figure 2) with
different maximum degree, $d_{max}$, corresponding to different upper bounds $\Delta_u$. Networks A and
B have the smallest possible maximum degree of any connected graph ($d_{max} = 2$); they have
the largest possible upper bound ($\Delta_u = 1$). Network F has the largest maximum degree of the
networks studied ($d_{max} = 15$) and has the smallest upper bound ($\Delta_u = 0.098$).

In Matlab, on a Microsoft Azure cloud virtual machine with a 2.6 GHz Intel Xeon E5-2670 and
56 GB of RAM, for a 16-node network, it takes approximately 2 secs to generate the sparse
transition rate matrix $Q_e$ and 460 secs to solve for the eigenvector corresponding to the 0
eigenvalue. For a 20-node network, it takes approximately 30 secs to generate the transition rate
matrix $Q_e$; we receive an OUT-OF-MEMORY error when computing the eigenvector.

July 3, 2015 DRAFT



Fig. 2: Different Network Topologies with Different Maximum Degree.
(a) Network A ($d_{max} = 2$, $\Delta_u = 1$); (b) Network B ($d_{max} = 2$, $\Delta_u = 1$); (c) Network C ($d_{max} = 5$, $\Delta_u = 0.32$).


B. Results: $\pi_e(\mathbf{x})$ and $\pi_{approx}(\mathbf{x})$

To provide intuition on the quality of the approximations for different TVDs, we plot in
Figure 3 the true equilibrium distribution, $\pi_e(\mathbf{x})$, of the extended contact process together
with the approximate equilibrium distribution, $\pi_{approx}(\mathbf{x})$. The Y-axis displays both equilibrium
distributions; we use log scaling for better visualization. The $2^{16}$ network configurations are on
the X-axis. The configurations are ordered such that high probability configurations in $\pi_e(\mathbf{x})$ are
in the center.
Figure 3 shows $\pi_e(\mathbf{x})$ and $\pi_{approx}(\mathbf{x})$ and their corresponding TVD, see (10), for the six different
network topologies (see Figure 2) for parameters $\frac{\lambda}{\mu} = 0.7$, $\Delta = 0.0023$. This value of $\Delta$ is much
smaller than the upper bound, $\Delta_u$, for all the networks. The equilibrium distribution, $\pi_e(\mathbf{x})$, of
the extended contact process is well approximated (i.e., the TVD is on the order of $10^{?}$ or
smaller) by the equilibrium distribution, $\pi_{approx}(\mathbf{x})$, of the scaled SIS process. Note that this value
of TVD is over $2^{16}$ configurations, so the actual divergence for any configuration is very small.
The two distributions are almost identical for all the networks.

We also considered the case when the condition of Theorem IV.1 is not satisfied. Figure 4
shows $\pi_e(\mathbf{x})$ and $\pi_{approx}(\mathbf{x})$ and their corresponding TVD for parameters $\frac{\lambda}{\mu} = 0.7$, $\Delta = 1.0496$.
In this case, the value of $\Delta$ is above the upper bound $\Delta_u$ for all the networks in Figure 2. As
we expect, TVD is larger when compared to the TVD for processes with $\Delta$ well below $\Delta_u$
(see Figure 3). Again, for the same infection and healing rates, different networks have different
TVD values.

For Networks A and B, the deviation between the true and approximate equilibrium distributions,
0.1073 and 0.1186, respectively, is relatively small. We see from Figure 4a and Figure 4b
that many configurations have similar equilibrium probability for both distributions. For Networks
D and E, TVDs are 0.4798 and 0.7898, respectively. These figures show that the approximate
distribution tends to overestimate the probability of highly probable configurations but
underestimates the low probability configurations. However, there is good correlation between the relative
ordering of configurations in both distributions; configurations that are highly probable in $\pi_e(\mathbf{x})$
are also highly probable in $\pi_{approx}(\mathbf{x})$.

C. Results: TVD vs. $\Delta$ and $\frac{\lambda}{\mu}$

We consider here the approximation of $\pi_e(\mathbf{x})$ by $\pi_{approx}(\mathbf{x})$ as the infection and healing rates
change. Like the scaled SIS process, we can interpret the extended contact process as consisting
of a topology-independent process parameterized by $\frac{\lambda}{\mu}$ (it is topology independent because
the exogenous infection rate $\lambda$ and the healing rate $\mu$ are identical for all the agents in the
network) and a topology-dependent process parameterized by the endogenous infection rate $\beta_e$.
When $\frac{\lambda}{\mu}$ is large, the topology-independent process exerts a larger effect on the equilibrium
behavior of the network processes.

Fig. 3: $\pi_e(\mathbf{x})$ and $\pi_{approx}(\mathbf{x})$ when $\frac{\lambda}{\mu} = 0.7$, $\Delta = 0.0023$. Each panel plots the approximate distribution and the true distribution against the configuration, $\mathbf{x}$.
(a) Network A (TVD = $1.0384 \times 10^{?}$); (b) Network B (TVD = $1.1236 \times 10^{?}$); (c) Network C (TVD = $3.2392 \times 10^{?}$);
(d) Network D (TVD = $5.0208 \times 10^{?}$); (e) Network E (TVD = $2.7487 \times 10^{?}$); (f) Network F (TVD = $2.2729 \times 10^{?}$).

Fig. 4: $\pi_e(\mathbf{x})$ and $\pi_{approx}(\mathbf{x})$ when $\frac{\lambda}{\mu} = 0.7$, $\Delta = 1.0496$.
(a) Network A (TVD = 0.1073); (b) Network B (TVD = 0.1186); (c) Network C (TVD = 0.294);
(d) Network D (TVD = 0.4798); (e) Network E (TVD = 0.7898); (f) Network F (TVD = 0.1330).

Figure 5 shows the TVD between the equilibrium distribution, $\pi_e(\mathbf{x})$, of the extended contact
process and the approximate distribution, $\pi_{approx}(\mathbf{x})$, for different $\frac{\lambda}{\mu}$ and $\Delta$ values both below
and well above the threshold $\Delta_u$. Figure 5 considers the six different network topologies in
Figure 2. We plot $\Delta$ along the X-axis and the TVD between $\pi_e(\mathbf{x})$ and $\pi_{approx}(\mathbf{x})$ along the
Y-axis. Different curves in each figure correspond to equilibrium distributions with different $\frac{\lambda}{\mu}$
values.

For the same $\Delta$ value, we observe that larger $\frac{\lambda}{\mu}$ corresponds to smaller TVD. This holds for
all the networks. Also, as we expect, for $\Delta \ll \Delta_u$, TVD is negligible for all the networks. The
deviation between the true equilibrium distribution and the approximation increases as $\Delta$ moves
toward $\Delta_u$; the rate of this increase differs for different topologies. Surprisingly, this increase
is not monotonic for all network topologies. As $\Delta$ increases to values larger than $\Delta_u$, TVD
may actually decrease. We observe this decrease in TVD for both Network E in Figure 5e and
Network F in Figure 5f. In particular, Network F, which has the largest maximum degree of
all the six networks, has relatively small deviation between $\pi_e(\mathbf{x})$ and $\pi_{approx}(\mathbf{x})$ compared to the
other network topologies.

VI. Most-Probable Configuration 

We showed in Figures 3 and 4 that, for a range of the dynamic parameters, the equilibrium
distribution $\pi_e(\mathbf{x})$ of the extended contact process is well approximated by the equilibrium
distribution $\pi_{approx}(\mathbf{x})$ of the scaled SIS process. In this section, we consider the problem of finding
the most-probable configuration (i.e., the configuration with the maximum equilibrium probability).

For network processes, the most-probable configuration depends on the infection and healing
rates and on the underlying network topology. It identifies the set of agents that are most likely
to be infected in the long run. These are the more vulnerable agents in the network. If the
most-probable configuration is $\mathbf{x}^0 = [0, 0, \ldots, 0]^T$, all agents are healthy, whereas if the most-probable
configuration is $\mathbf{x}^N = [1, 1, \ldots, 1]^T$, then all agents are at risk regardless of their location in the
network. Except for these two cases, finding which agents are infected in the most-probable
configuration is not trivial. The most-probable configuration of the extended contact process is

$$\mathbf{x}_e^* = \arg\max_{\mathbf{x} \in \mathcal{X}} \pi_e(\mathbf{x}),$$

where $\mathcal{X}$ is the set of all $2^N$ possible network configurations. For the extended contact process,
there is no closed-form description of the equilibrium distribution, so this problem can only be
solved numerically, which is infeasible for large-scale networks.
Fig. 5: Dependence of TVD($\pi_e$, $\pi_{approx}$) on $\Delta$. Each panel, titled "Total Variation Distance vs. $\Delta$", plots the TVD against the $\Delta$ value, with one curve per $\frac{\lambda}{\mu}$ value.
(b) Network B ($\Delta_u = 1$); (c) Network C ($\Delta_u = 0.32$); (d) Network D ($\Delta_u = 0.32$); (f) Network F ($\Delta_u = 0.098$).
On the other hand, as stated in Theorem IV.1, when $\Delta \ll \Delta_u$, the equilibrium distribution,
$\pi_e(\mathbf{x})$, of the extended contact process is well approximated by the equilibrium distribution,
$\pi_{approx}(\mathbf{x})$, of a scaled SIS process with endogenous infection rate $\beta_s = 1 + \Delta$. We proved in
[10] that, in this case, the most-probable configuration of the scaled SIS process can be solved
in polynomial time because it corresponds to solving for the minimum of a submodular function.
It is therefore possible to identify vulnerable network substructures for networks with hundreds
and thousands of agents.
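For very small networks the most-probable configuration of the scaled SIS process can be found by exhaustive search over the closed-form weights, which gives a reference point for checking faster methods. The sketch below is a brute-force illustration only; it is not the polynomial-time submodular minimization referred to above, and the function name `most_probable_configuration` is hypothetical. It maximizes the log of the unnormalized weight, $\mathbf{1}^T\mathbf{x} \log\frac{\lambda}{\mu} + \frac{\mathbf{x}^T A \mathbf{x}}{2} \log \beta_s$, so the partition function is never needed.

```python
import itertools
import math

def most_probable_configuration(A, lam, mu, beta_s):
    """Exhaustive argmax of the scaled SIS equilibrium weight (small N only).

    A is a symmetric 0/1 adjacency matrix given as a list of lists.
    """
    N = len(A)
    best, best_score = None, -math.inf
    for x in itertools.product([0, 1], repeat=N):
        infected = sum(x)
        # number of edges whose two end nodes are both infected
        edges = sum(A[i][j] * x[i] * x[j] for i in range(N) for j in range(i + 1, N))
        score = infected * math.log(lam / mu) + edges * math.log(beta_s)
        if score > best_score:
            best, best_score = x, score
    return best
```

With $\frac{\lambda}{\mu} = 0.9744$ and $\beta_s = 1.02$, a 4-node complete graph returns the all-infected configuration while a 4-node path returns the all-healthy one, a small-scale instance of denser structures being more at risk.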

From the simulation results in the previous section, we now compare the most-probable
configuration of the extended contact process with the most-probable configuration of the
approximating scaled SIS process. Table I lists, for the six networks in Figure 2, the TVD between
the distributions, the corresponding most-probable configurations, and the probabilities of the
most-probable configuration for $\frac{\lambda}{\mu} = 0.9744$ and $\Delta = 0.02$. We observe that when the condition
of Theorem IV.1 is satisfied:

1) the most-probable configuration, $\mathbf{x}_e^*$, of the extended contact process is approximately
the same as the most-probable configuration, $\mathbf{x}_{approx}^*$, of the scaled SIS process;

2) the probability of the most-probable configuration, $\pi_e(\mathbf{x}_e^*)$, of the extended contact process
is approximately the same as the probability of the most-probable configuration, $\pi_{approx}(\mathbf{x}_{approx}^*)$, of the
scaled SIS process.



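Network     TVD($\pi_e$, $\pi_{approx}$)   $\mathbf{x}_e^*$        $\mathbf{x}_{approx}^*$   $\pi_e(\mathbf{x}_e^*)$          $\pi_{approx}(\mathbf{x}_{approx}^*)$
Network A   $1.0236 \times 10^{?}$     $\mathbf{x}^0$          $\mathbf{x}^0$          $1.7431 \times 10^{?}$   $1.7427 \times 10^{?}$
Network B   $1.1027 \times 10^{?}$     $\mathbf{x}^0$          $\mathbf{x}^0$          $1.7347 \times 10^{?}$   $1.7342 \times 10^{?}$
Network C   $3.3806 \times 10^{?}$     see Figure 6c   see Figure 6c   $1.7107 \times 10^{?}$   $1.7154 \times 10^{?}$
Network D   $5.2714 \times 10^{?}$     $\mathbf{x}^N$          $\mathbf{x}^N$          $1.8622 \times 10^{?}$   $1.8781 \times 10^{?}$
Network E   0.0031                  $\mathbf{x}^N$          $\mathbf{x}^N$          $3.1073 \times 10^{?}$   $3.277 \times 10^{?}$
Network F   0.0023                  $\mathbf{x}^0$          $\mathbf{x}^0$          $1.7419 \times 10^{?}$   $1.7389 \times 10^{?}$

Here $\mathbf{x}^0 = [0, 0, \ldots, 0]^T$ and $\mathbf{x}^N = [1, 1, \ldots, 1]^T$.

TABLE I: Most-Probable Configuration when $\frac{\lambda}{\mu} = 0.9744$ and $\Delta = 0.02$.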


For Networks A, B, and F, the most-probable configuration for both the extended contact
process and the approximate scaled SIS process is $x^0$, the configuration where all the agents
are healthy. However, for the same infection and healing rates, the most-probable configuration
for Networks D and E for both the extended contact process and the scaled SIS process is
$x^N$, the configuration where all the agents are infected. Figure 6c shows that the most-probable
configuration for Network C is neither $x^0$ nor $x^N$ but a configuration where nine agents are
infected while seven agents are healthy; we call most-probable configurations that are neither
$x^0$ nor $x^N$ non-degenerate most-probable configurations.

[Figure 6 panels: (a) Network A, $d(G) = 0.9375$; (b) Network B, $d(G) = 1$; (c) Network C, $d(G) = 1.1875$;
(d) Network D, $d(G) = 1.75$; (e) Network E, $d(G) = 4.125$]

Fig. 6: Most Probable Configuration when $\lambda/\mu = 0.9744$ and $\Delta = 0.02$
(infected = black, healthy = white)
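For a network this small, the most-probable configuration of the approximating scaled SIS process can also be found by brute-force enumeration of the closed-form distribution. The sketch below is illustrative only: it assumes the closed form $\pi_{\mathrm{approx}}(x) \propto (\lambda/\mu)^{\mathbf{1}^T x}(1+\Delta)^{x^T A x/2}$, and it uses a 16-node star as a hypothetical stand-in for Network F, whose exact edge list is not given here. The function name `scaled_sis_distribution` is not from the paper.

```python
import numpy as np

def scaled_sis_distribution(adj, lam_over_mu, delta):
    """Enumerate all 2^N configurations and return (configs, probabilities).

    Assumes pi(x) is proportional to (lam/mu)^(1^T x) * (1 + delta)^(x^T A x / 2).
    """
    n = adj.shape[0]
    states = np.arange(2 ** n)
    configs = (states[:, None] >> np.arange(n)) & 1  # row k = bit pattern of state k
    num_infected = configs.sum(axis=1)
    # x^T A x / 2 counts the edges whose two endpoints are both infected.
    infected_edges = np.einsum("ki,ij,kj->k", configs, adj, configs) / 2
    log_w = num_infected * np.log(lam_over_mu) + infected_edges * np.log1p(delta)
    weights = np.exp(log_w)
    return configs, weights / weights.sum()

# Hypothetical stand-in for Network F: agent 0 is the center of a 16-node star.
N = 16
adj = np.zeros((N, N), dtype=int)
adj[0, 1:] = 1
adj[1:, 0] = 1

configs, pi = scaled_sis_distribution(adj, lam_over_mu=0.9744, delta=0.02)
best = int(np.argmax(pi))
print("most-probable configuration:", configs[best])
print("its probability:", pi[best])
```

Under the star assumption, the enumeration selects the all-healthy configuration with a probability of roughly $1.74 \times 10^{-5}$, in line with the Network F row of Table I. Enumeration scales as $2^N$, so it is only a sanity check for small networks and not a substitute for the polynomial-time method of [10].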

For an extended contact process with ratio of exogenous infection rate to healing rate
$\lambda/\mu = 0.9744$ and endogenous infection rate $\beta_e = (\lambda/\mu)\Delta = 0.019488$, the epidemic is minor
in Networks A, B, and F, but should be of concern in Networks D and E. In Network C, a subset of
agents is more at risk than the others. Different networks have different risk levels because the
propagation of contagious infection depends on the underlying network topology. The result in
Figure 6c confirms for the extended contact process what we proved for the scaled SIS process in [10],
namely, that in the most-probable configuration the infected agents belong to dense subgraphs
in the network. Reference [19] defines the density of a graph $G$ by

$$d(G) = \frac{|E(G)|}{|V(G)|},$$

where $|E(G)|$ is the total number of edges and $|V(G)|$ is the total number of nodes. Networks
that are more connected have higher densities than sparsely connected networks.

Networks with high density, such as Networks D and E, are more at risk of contagion than
networks with low density, such as Networks A, B, and F. Network F, although it has the largest
maximum degree, has the same density as Network A. It is difficult for infection to spread in
Network F because the center agent is the only agent capable of transmitting the infection to
its neighbors. We showed in [10] that the nine infected agents in Network C are more at risk
of infection than the other agents because they form a subgraph that is denser than the overall
network; these nine agents are especially well-connected in this network.
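The point that a large maximum degree does not imply a high density can be checked in a few lines. This is a minimal sketch using $d(G) = |E(G)|/|V(G)|$; the two edge lists are hypothetical stand-ins (a 16-node star for a topology like Network F, and a 16-node path as an arbitrary sparse tree), since the actual edge sets of the six networks are not listed here. The helper names `density` and `max_degree` are illustrative.

```python
from collections import Counter

def density(num_nodes, edges):
    """d(G) = |E(G)| / |V(G)|, counting each undirected edge once."""
    unique_edges = {frozenset(e) for e in edges}
    return len(unique_edges) / num_nodes

def max_degree(edges):
    """Largest number of distinct neighbors of any node."""
    degree = Counter()
    for u, v in {tuple(sorted(e)) for e in edges}:
        degree[u] += 1
        degree[v] += 1
    return max(degree.values())

N = 16
star_edges = [(0, i) for i in range(1, N)]       # hypothetical: one center agent
path_edges = [(i, i + 1) for i in range(N - 1)]  # hypothetical: sparse tree

for name, edges in [("star", star_edges), ("path", path_edges)]:
    print(name, "density:", density(N, edges), "max degree:", max_degree(edges))
```

Both graphs have 15 edges on 16 nodes and therefore the same density, even though the maximum degrees differ by a wide margin.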



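| Network | $\mathrm{TVD}(\pi_e, \pi_{\mathrm{approx}})$ | $x_e^*$ | $x_{\mathrm{approx}}^*$ | $\pi_e(x_e^*)$ | $\pi_{\mathrm{approx}}(x_{\mathrm{approx}}^*)$ |
|---|---|---|---|---|---|
| Network A | $0.0266$ | $x^0 = [0,0,\ldots,0]^T$ | $x^0 = [0,0,\ldots,0]^T$ | $6.7989 \times 10^{-5}$ | $6.4085 \times 10^{-5}$ |
| Network B | $0.029$ | $x^0 = [0,0,\ldots,0]^T$ | $x^N = [1,1,\ldots,1]^T$ | $6.2942 \times 10^{-5}$ | $6.1972 \times 10^{-5}$ |
| Network C | $0.0848$ | see Figure 6c | $x^N = [1,1,\ldots,1]^T$ | $7.0847 \times 10^{-5}$ | $1.214 \times 10^{-?}$ |
| Network D | $0.1505$ | $x^N = [1,1,\ldots,1]^T$ | $x^N = [1,1,\ldots,1]^T$ | $2.5957 \times 10^{-?}$ | $0.0011$ |
| Network E | $0.6652$ | $x^N = [1,1,\ldots,1]^T$ | $x^N = [1,1,\ldots,1]^T$ | $0.0066$ | $0.1849$ |
| Network F | $0.1609$ | $x^0 = [0,0,\ldots,0]^T$ | $x^0 = [0,0,\ldots,0]^T$ | $5.988 \times 10^{-5}$ | $3.7915 \times 10^{-5}$ |

(A "?" marks an exponent digit that is illegible in the source.)

TABLE II: Most-Probable Configuration when $\lambda/\mu = 0.7$ and $\Delta = 0.4333$.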


Table II lists the TVD between the distributions, the most-probable configurations for the
extended contact process and the approximate scaled SIS process, and the probabilities of the
most-probable configurations for $\lambda/\mu = 0.7$ and $\Delta = 0.4333$; the factor $\Delta$ no longer satisfies the
condition of Theorem IV.1. As a result, the TVD between the distributions is larger and, for
Networks B and C, the most-probable configurations of the extended contact process and the
approximate scaled SIS process are no longer the same. One intuition as to why, for Network B,
$x_{\mathrm{approx}}^* = x^N$ (the configuration where all the agents are infected) while $x_e^* = x^0$ (the configuration
where all the agents are healthy) is that the endogenous infection rate of the scaled SIS process is
exponentially dependent on the number of infected neighbors while the endogenous infection
rate of the extended contact process is linearly dependent on the number of infected neighbors;
contagion is more virulent because the endogenous infection rate is higher in the scaled SIS
process.
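The gap between the two rate laws can be made concrete. Assuming the per-neighbor term is matched as in the linear approximation, so that the normalized infection rate with $m$ infected neighbors is $(\lambda/\mu)(1 + \Delta m)$ for the extended contact process and $(\lambda/\mu)(1+\Delta)^m$ for the scaled SIS process, their ratio is $(1+\Delta)^m/(1+\Delta m)$ regardless of $\lambda/\mu$. The short sketch below (the function name `rate_ratio` is illustrative) tabulates this ratio for the two values of $\Delta$ used in the tables.

```python
def rate_ratio(delta, m):
    """Scaled SIS infection rate divided by the matched linear (extended contact) rate."""
    exponential = (1.0 + delta) ** m   # exponential dependence on m
    linear = 1.0 + delta * m           # linear dependence on m
    return exponential / linear

for delta in (0.02, 0.4333):
    ratios = [round(rate_ratio(delta, m), 4) for m in range(0, 9, 2)]
    print(f"delta = {delta}: ratio for m = 0, 2, 4, 6, 8 ->", ratios)
```

For small $\Delta$ the ratio stays close to 1 over this range of $m$, whereas for $\Delta = 0.4333$ it grows quickly with the number of infected neighbors, which is consistent with the scaled SIS process favoring heavily infected configurations.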

Note that the configuration in Figure 6c, where nine agents are more at risk of infection than
the others, remains the most-probable configuration for Network C under the extended contact
process. Even though this configuration no longer has the highest equilibrium probability in the
approximate distribution, it remains a highly probable configuration. This reinforces our
observation from Figure IH that configurations with high probabilities in the approximate
distribution are also highly probable in the equilibrium distribution of the extended contact
process. The substructures that are vulnerable for the scaled SIS process, the non-degenerate
most-probable configurations, are also vulnerable substructures of the extended contact process.


VII. Conclusion 

This paper considers conditions under which the extended contact process [|T1|, [fT^ is well
approximated by a scaled SIS process [9]. This is important because the extended contact process
can only be studied by numerical means, except in the trivial case of the complete network, whereas
for the scaled SIS process, we have a closed-form solution for its equilibrium distribution. Both
processes model Markov dynamic processes over a static, undirected network. The extended
contact process, also called the $\epsilon$-SIS process, is a modification of the basic contact process [2]
to include a nonzero exogenous infection rate. The contact process is often used as a model of
the diffusion of a virus or of information over networks.

We are interested in understanding how the time-asymptotic behavior of dynamical network
processes, in particular of the extended contact process, depends on the underlying network
topology. The equilibrium distribution of the extended contact process, although well-defined,
requires numerically solving an eigenvalue-eigenvector problem for a $2^N \times 2^N$ matrix, which is
infeasible except for very small networks. It is also not analytically available for arbitrary
network topologies. In this paper, we show how the equilibrium distribution of the scaled SIS
process, which does have a closed-form analytical description dH, is, under a certain range of
parameter values, a good approximation to the equilibrium distribution of the extended contact
process.

The extended contact process assumes that the infection rate of a susceptible agent has a
linear dependence on the number of infected neighbors, whereas the scaled SIS process assumes
that the infection rate is exponentially dependent on the number of infected neighbors. The
paper gives a conservative upper bound on the endogenous infection rate, $\beta_e$, of the extended
contact process for which the equilibrium distribution of the extended contact process, $\pi_e(x)$, is
well approximated by that of an equivalent scaled SIS process, $\pi_{\mathrm{approx}}(x)$. We showed that this
upper bound depends on the maximum degree, $d_{\max}$, of the underlying network topology.

We confirmed these results with simulations using six different networks with 16 nodes (for
which we can numerically solve the associated eigenvalue-eigenvector problem). We compared the
true equilibrium distribution, $\pi_e(x)$, of the extended contact process, obtained numerically, with
the approximate distribution, $\pi_{\mathrm{approx}}(x)$, derived from the scaled SIS process. By checking
the deviation for parameters both within and outside the theoretical bound, we confirmed that
the proposed approximation is accurate when the bound is satisfied. Depending on the underlying
network, the total variation distance (TVD) between the true equilibrium distribution and its
approximation may remain small even for processes whose parameter values exceed the upper bound.
Further, the TVD does not necessarily increase with increasing deviation from the theoretical
upper bound. In future work, we would like to explore how the structure of the transition rate
matrices leads to a decrease in deviation between the extended contact process and the approximation.

We then used this approximation to determine the most-probable configuration of the extended
contact process. Unlike for the scaled SIS process, we do not have any bounds on the infection
and healing rates of the extended contact process that determine when the most-probable
configuration is $x^0$, $x^N$, or a non-degenerate configuration. When the TVD is small, the
configurations with the highest equilibrium probability are identical for both the extended contact
process and the scaled SIS process. This means that we can use the scaled SIS process, whose
most-probable configuration can be found in polynomial-time, to find the most-probable
configuration of the extended contact process. The most-probable configurations reveal subgraphs
in the network that are more vulnerable to infection by the extended contact process.
Appendix A
Proof of Lemma IV.1

Lemma A.1. For any nonnegative integer $m$ from 0 to $d_{\max}$, if

$$\Delta^2 \ll \frac{2}{d_{\max}(d_{\max} - 1)},$$

then, for $\beta_s = 1 + \Delta$,

$$\frac{\lambda}{\mu}\beta_s^m = \frac{\lambda}{\mu}(1 + \Delta)^m \approx \frac{\lambda}{\mu} + \frac{\lambda}{\mu}\Delta m.$$


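Proof: From the binomial series, for integer $m \in \{0, 1, \ldots, d_{\max}\}$,

$$
\begin{aligned}
\frac{\lambda}{\mu}\beta_s^m = \frac{\lambda}{\mu}(1 + \Delta)^m
&= \frac{\lambda}{\mu}\sum_{k=0}^{m}\binom{m}{k}\Delta^k \\
&= \frac{\lambda}{\mu}\left(\binom{m}{0}\Delta^0 + \binom{m}{1}\Delta + \binom{m}{2}\Delta^2 + \binom{m}{3}\Delta^3 + \ldots + \binom{m}{m}\Delta^m\right) \\
&= \frac{\lambda}{\mu}\left(1 + m\Delta + \frac{m(m-1)}{2}\Delta^2 + \ldots + m\Delta^{m-1} + \Delta^m\right).
\end{aligned}
$$

If

$$\binom{m}{2}\Delta^2,\ \binom{m}{3}\Delta^3,\ \ldots,\ \binom{m}{m}\Delta^m \ll 1, \quad \forall m \in \{0, 1, \ldots, d_{\max}\}, \tag{11}$$

then the quadratic and higher order terms in the summation are negligible and we obtain the
linear approximation

$$\frac{\lambda}{\mu}\beta_s^m \approx \frac{\lambda}{\mu} + \frac{\lambda}{\mu}\Delta m,$$

which holds for all $m$.

Recognize that for $m \in \{0, 1, \ldots, d_{\max}\}$, $\Delta^2 > \Delta^3 > \ldots > \Delta^m$. This means that

$$\binom{m}{2}\Delta^2 > \binom{m}{3}\Delta^3 > \ldots > \binom{m}{m}\Delta^m, \quad \forall m \in \{0, 1, \ldots, d_{\max}\}.$$

The largest possible upper bound is attained when $m = d_{\max}$. Therefore, condition (11) is satisfied when

$$\frac{d_{\max}(d_{\max} - 1)}{2}\Delta^2 \ll 1.$$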
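As a quick numerical illustration of the final condition, consider a hypothetical network with $d_{\max} = 15$ (for example, a 16-node star; this degree is an assumption, not a value stated above). The quantity $\tfrac{d_{\max}(d_{\max}-1)}{2}\Delta^2$ then evaluates as follows for the two values of $\Delta$ used in Tables I and II:

```latex
\frac{d_{\max}(d_{\max}-1)}{2}\,\Delta^2
  = \frac{15 \cdot 14}{2}\,\Delta^2
  = 105\,\Delta^2
  =
  \begin{cases}
    105 \times (0.02)^2 = 0.042, & \Delta = 0.02, \\[4pt]
    105 \times (0.4333)^2 \approx 19.71, & \Delta = 0.4333.
  \end{cases}
```

The first value is well below 1, so the linear approximation is expected to hold; the second is far above 1, so it is not.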
Theorem B.1. Consider the extended contact process with exogenous infection rate $\lambda$, healing rate
$\mu$, and endogenous infection rate $\beta_e$, over a static, simple, connected, undirected network of
arbitrary topology, $G$, with maximum degree $d_{\max}$. Let $\beta_e = \frac{\lambda}{\mu}\Delta$. If

$$\Delta^2 \ll \frac{2}{d_{\max}(d_{\max} - 1)},$$

then the equilibrium distribution of the extended contact process is well approximated by

$$\pi_{\mathrm{approx}}(x) = \frac{1}{Z}\left(\frac{\lambda}{\mu}\right)^{\mathbf{1}^T x}(1 + \Delta)^{\frac{x^T A x}{2}}, \quad x \in \{0,1\}^N,$$

where $A$ is the adjacency matrix of the network $G$, and $Z$ is the partition function. The
approximate distribution, $\pi_{\mathrm{approx}}(x)$, is the equilibrium distribution (7) of an equivalent scaled
SIS process over the same network $G$ with exogenous infection rate $\lambda$, healing rate $\mu$, and
endogenous infection rate $\beta_s = 1 + \Delta$.


Proof: From the theory of continuous-time Markov processes [11], the equilibrium distribution
of the extended contact process is the left eigenvector of the transition rate matrix, $Q_e$,
corresponding to the 0 eigenvalue:

$$\pi_e Q_e = 0.$$

Entries of the matrix $Q_e$ correspond to the transition rates from one configuration $x \in \{0,1\}^N$ to
another configuration according to the rates (H]) and (3).

Lemma IV.1 gave the condition under which the infection rates (normalized by the healing rate) of
the extended contact process are approximately the same as those of the scaled SIS process. As a
result, the transition rate matrices of the two processes are approximately the same. Therefore, the
left eigenvector corresponding to the 0 eigenvalue of $Q_e$ is approximately the left eigenvector
corresponding to the 0 eigenvalue of $Q_s$, the transition rate matrix of the scaled SIS process with
entries generated according to dH) and dS]). We know that the left eigenvector of interest for the
rate matrix, $Q_s$, of the scaled SIS process is given by the closed-form equation db]).
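The argument can be checked numerically on a network small enough for the $2^N \times 2^N$ eigenproblem to be solved directly. The sketch below makes several assumptions that should be read as such: a healthy agent with $m$ infected neighbors becomes infected at rate $\lambda + \beta_e m$ and an infected agent heals at rate $\mu$; $\mu = 1$ and $\beta_e = (\lambda/\mu)\Delta$; the closed form is $\pi_{\mathrm{approx}}(x) \propto (\lambda/\mu)^{\mathbf{1}^T x}(1+\Delta)^{x^T A x/2}$; the TVD is taken as half the $\ell_1$ distance; and the test network is a hypothetical 5-node cycle. All function names are illustrative.

```python
import numpy as np

def ecp_equilibrium(adj, lam, mu, beta_e):
    """Solve pi Q = 0, sum(pi) = 1 for the extended contact process (2^N states)."""
    n = adj.shape[0]
    size = 2 ** n
    Q = np.zeros((size, size))
    for s in range(size):                      # bit i of s is the state of agent i
        for i in range(n):
            if (s >> i) & 1:                   # infected agent heals at rate mu
                Q[s, s ^ (1 << i)] = mu
            else:                              # healthy agent: rate linear in infected neighbors
                m = sum((s >> j) & 1 for j in range(n) if adj[i, j])
                Q[s, s | (1 << i)] = lam + beta_e * m
        Q[s, s] = -Q[s].sum()                  # rows of a rate matrix sum to zero
    lhs = np.vstack([Q.T, np.ones(size)])      # append the normalization constraint
    rhs = np.zeros(size + 1)
    rhs[-1] = 1.0
    pi, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return pi

def scaled_sis_equilibrium(adj, lam_over_mu, delta):
    """Closed-form equilibrium of the scaled SIS process, normalized by enumeration."""
    n = adj.shape[0]
    states = np.arange(2 ** n)
    configs = (states[:, None] >> np.arange(n)) & 1
    infected_edges = np.einsum("ki,ij,kj->k", configs, adj, configs) / 2
    weights = lam_over_mu ** configs.sum(axis=1) * (1.0 + delta) ** infected_edges
    return weights / weights.sum()

def tvd(p, q):
    """Total variation distance, taken here as half the L1 distance."""
    return 0.5 * np.abs(p - q).sum()

n = 5                                          # hypothetical 5-node cycle
adj = np.zeros((n, n), dtype=int)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1

lam, mu = 0.9744, 1.0
tvds = {}
for delta in (0.0, 0.02, 0.4333):
    pi_e = ecp_equilibrium(adj, lam, mu, beta_e=(lam / mu) * delta)
    pi_approx = scaled_sis_equilibrium(adj, lam / mu, delta)
    tvds[delta] = tvd(pi_e, pi_approx)
    print(f"delta = {delta}: TVD = {tvds[delta]:.3e}")
```

With $\Delta = 0$ the two processes coincide, so the TVD should vanish up to numerical error; the TVD is then expected to grow as $\Delta$ moves outside the range covered by the theorem.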
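Appendix C

Properties of Functions of Exponentially Distributed Random Variables [20]
A. Two Independent Random Variables

Let $A \sim \exp(\alpha)$, $B \sim \exp(\beta)$ be two independent exponentially distributed random variables.
Then,

$$
\begin{aligned}
P(A < B) &= P(A - B < 0) \\
&= \int_0^\infty \int_0^b \alpha e^{-\alpha a}\, \beta e^{-\beta b}\, da\, db \\
&= \int_0^\infty \left(\int_0^b \alpha e^{-\alpha a}\, da\right) \beta e^{-\beta b}\, db \\
&= \int_0^\infty \left(1 - e^{-\alpha b}\right) \beta e^{-\beta b}\, db \\
&= \int_0^\infty \beta e^{-\beta b}\, db - \int_0^\infty \beta e^{-(\alpha + \beta) b}\, db \\
&= 1 - \frac{\beta}{\alpha + \beta} \\
&= \frac{\alpha}{\alpha + \beta}.
\end{aligned}
$$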
B. Multiple Independent Random Variables

Let $A \sim \exp(\alpha)$, $B_1 \sim \exp(\beta_1)$, $B_2 \sim \exp(\beta_2)$, $\ldots$, $B_m \sim \exp(\beta_m)$ be independent
exponentially distributed random variables. Let $C = \min\{B_1, B_2, \ldots, B_m\}$; from properties of
the exponential distribution, $C$ is also an exponentially distributed random variable with rate
$\beta_1 + \beta_2 + \ldots + \beta_m$.

Therefore,

$$
\begin{aligned}
P(A < C) &= P(A - C < 0) \\
&= 1 - \frac{\beta_1 + \beta_2 + \ldots + \beta_m}{\alpha + \beta_1 + \beta_2 + \ldots + \beta_m} \\
&= \frac{\alpha}{\alpha + \beta_1 + \beta_2 + \ldots + \beta_m}.
\end{aligned}
$$

The proof follows by induction on the number of independent exponentially distributed random
variables.
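Both identities in this appendix are easy to sanity-check by simulation. The following Monte Carlo sketch uses arbitrary illustrative rates (they are not values from the paper) and NumPy's exponential sampler, which is parameterized by the scale $1/\text{rate}$.

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples = 200_000

alpha = 2.0                         # rate of A (illustrative value)
betas = np.array([0.5, 1.0, 1.5])   # rates of B_1, B_2, B_3 (illustrative values)

a = rng.exponential(scale=1.0 / alpha, size=num_samples)
b = rng.exponential(scale=1.0 / betas, size=(num_samples, betas.size))

# Two variables: P(A < B_1) should be close to alpha / (alpha + beta_1).
est_two = np.mean(a < b[:, 0])
exact_two = alpha / (alpha + betas[0])

# Multiple variables: C = min(B_1, ..., B_m) and P(A < C) = alpha / (alpha + sum of betas).
c = b.min(axis=1)
est_multi = np.mean(a < c)
exact_multi = alpha / (alpha + betas.sum())

print(f"P(A < B_1): simulated {est_two:.4f}, closed form {exact_two:.4f}")
print(f"P(A < C):   simulated {est_multi:.4f}, closed form {exact_multi:.4f}")
```

With this many samples, the simulated frequencies should agree with the closed-form values to within about two decimal places.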